Microbial bioremediation of persistent organic pollutants (POP)
Group 6: Javier López, Pablo Sánchez-Izquierdo, Víctor Fiérrez and Laura Casanovas
OUR DATA
Microbial bioremediation: a clean-up technique for reclaiming environments contaminated with persistent organic pollutants (POP). It is attractive, eco-friendly and cost-efficient.
Source: a manually curated, integrative database dedicated to research on microbial bioremediation of POP.
- Genes
- Strains
- Sequences (Complementary)
- Compound
- Calculated properties 01
We added the corresponding sequence to each sample.
Final dataset size: 5733 observations of 12 variables
GOAL
Elucidate patterns among organisms with bioremediation potential, in order to promote more effective and sustainable solutions.
CLEANING
- Select columns of interest: Compound Name, Enzyme Name, Encoding Gene, KEGG Orthology, Organism, GenBankID, Strain ID/Microorganism, UniProt ID, Protein ID, Continent, Country, Isolation source, Habitat notes
- Deal with null values
- Delete columns where most values are null (except for KEGG Orthology)
- Change variable names
- Standardize the naming conventions of: Encoding Gene, Continent, Country
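The cleaning steps above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the pipeline actually used: the toy records, the `K00000` placeholder, the 50% null threshold, the new variable names in `RENAME` and the exact naming rules (lower-case first letter for genes, title case for places) are all assumptions made for the example.

```python
# Toy records; every value here is invented for illustration only.
records = [
    {"Compound Name": "Lindane", "Encoding Gene": " LinA", "KEGG Orthology": None,
     "Continent": "asia", "Country": "india", "Habitat notes": None},
    {"Compound Name": "Lindane", "Encoding Gene": "linB", "KEGG Orthology": None,
     "Continent": "Asia", "Country": "JAPAN", "Habitat notes": None},
    {"Compound Name": "Atrazine", "Encoding Gene": "atzA", "KEGG Orthology": "K00000",
     "Continent": "EUROPE", "Country": "france", "Habitat notes": "soil"},
]

# Hypothetical new variable names (the slides do not list the real ones).
RENAME = {
    "Compound Name": "compound",
    "Encoding Gene": "gene",
    "KEGG Orthology": "kegg_orthology",
    "Continent": "continent",
    "Country": "country",
}
# Columns kept even when mostly null, as stated in the cleaning steps.
KEEP_EVEN_IF_NULL = {"KEGG Orthology"}


def null_fraction(rows, column):
    """Fraction of rows where the column is missing or empty."""
    missing = sum(1 for row in rows if row.get(column) in (None, ""))
    return missing / len(rows)


def clean(rows, max_null=0.5):
    """Drop mostly-null columns, normalise names, rename variables."""
    columns = list(rows[0].keys())
    kept = [c for c in columns
            if c in KEEP_EVEN_IF_NULL or null_fraction(rows, c) <= max_null]
    cleaned = []
    for row in rows:
        out = {}
        for column in kept:
            value = row.get(column)
            if isinstance(value, str):
                value = value.strip()
            if column == "Encoding Gene" and value:
                # Assumed convention: lower-case first letter (LinA -> linA).
                value = value[0].lower() + value[1:]
            if column in ("Continent", "Country") and value:
                # Assumed convention: title case (JAPAN -> Japan).
                value = value.title()
            out[RENAME.get(column, column)] = value or None
        cleaned.append(out)
    return cleaned


cleaned = clean(records)
for row in cleaned:
    print(row)
```

In this toy run, "Habitat notes" is dropped because two of its three values are null, while "KEGG Orthology" survives thanks to the explicit exception.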
DATA EXPLORATION: ORGANISM DISTRIBUTION
We restrict our study to Bacteria samples.
DATA EXPLORATION: GENES VS COMPOUNDS
- Too many genes: is there anything we can do?
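One possible way to reduce the number of categories is to collapse individual genes into their KEGG Orthology groups before comparing them with compounds. The sketch below only illustrates that idea; the compound names, gene names and orthology labels are placeholders, and it is an assumption that the dataset would be grouped exactly this way.

```python
from collections import defaultdict

# (compound, gene, orthology group) triples; all values are placeholders.
rows = [
    ("compound_X", "geneA1", "KO_1"),
    ("compound_X", "geneA2", "KO_1"),
    ("compound_X", "geneB", "KO_2"),
    ("compound_Y", "geneC1", "KO_3"),
    ("compound_Y", "geneC2", "KO_3"),
]


def distinct_per_compound(rows, index):
    """Count distinct values of field `index` for each compound."""
    groups = defaultdict(set)
    for row in rows:
        value = row[index]
        if value is not None:  # skip entries without an annotation
            groups[row[0]].add(value)
    return {compound: len(values) for compound, values in groups.items()}


genes_per_compound = distinct_per_compound(rows, 1)
orthologs_per_compound = distinct_per_compound(rows, 2)
print(genes_per_compound)
print(orthologs_per_compound)
```

With these placeholder rows, five genes collapse into three orthology groups, which is the kind of reduction that makes a genes-versus-compounds plot easier to read.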
RESULTS: ORTHOLOGY ANALYSIS
RESULTS: SEQUENCE ANALYSIS
CONCLUSIONS
- The bioremediation organisms studied to date are mainly Bacteria
- Many countries remain to be explored
- The chlorocyclohexane and chlorobenzene degradation pathway is involved in the bioremediation of many POP compounds
- Core sequence logos for genes degrading the same compound were determined by multiple sequence alignment
- Core sequence logos differ greatly depending on the compound targeted, and they can be grouped by proximity through phylogenetic tree analysis
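To make the sequence logo idea concrete, the sketch below computes what a logo displays: the per-column information content of an already aligned set of sequences, plus a simple consensus. It is an illustration under stated assumptions, not the analysis behind these slides: the toy alignment is invented, the sequences are assumed to be proteins (alphabet size 20), and the small-sample correction that logo tools usually apply is omitted.

```python
import math
from collections import Counter


def column_residues(alignment, i):
    """Residues in column i, ignoring gap characters."""
    return [seq[i] for seq in alignment if seq[i] != "-"]


def column_information(alignment, alphabet_size=20):
    """Information content (bits) per column: log2(alphabet) - entropy."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")
    info = []
    for i in range(length):
        residues = column_residues(alignment, i)
        if not residues:  # column made only of gaps
            info.append(0.0)
            continue
        total = len(residues)
        entropy = -sum((n / total) * math.log2(n / total)
                       for n in Counter(residues).values())
        info.append(math.log2(alphabet_size) - entropy)
    return info


def consensus(alignment):
    """Most frequent residue per column ('-' for all-gap columns)."""
    result = []
    for i in range(len(alignment[0])):
        residues = column_residues(alignment, i)
        result.append(Counter(residues).most_common(1)[0][0] if residues else "-")
    return "".join(result)


# Invented toy alignment, for illustration only.
alignment = ["MKLV", "MKIV", "MRLV", "MKLA"]
info = column_information(alignment)
print(consensus(alignment))
print([round(bits, 2) for bits in info])
```

Fully conserved columns reach the maximum of log2(20) bits, while variable columns score lower; tall and short stacks in a logo reflect the same quantity.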